Finding Minimal Generalizations for Unions of Pattern Languages and Its Application to Inductive Inference from Positive Data

Authors

  • Hiroki Arimura
  • Takeshi Shinohara
  • Setsuko Otsuki
Abstract

A pattern is a string of constant symbols and variables. The language defined by a pattern p is the set of constant strings obtained from p by substituting nonempty constant strings for variables in p. In this paper we are concerned with polynomial time inference from positive data of the class of unions of a bounded number of pattern languages. We introduce a syntactic notion of minimal multiple generalizations (mmg for short) to study the inferability of classes of unions. If a pattern p is obtained from another pattern q by substituting nonempty patterns for variables in q, q is said to be more general than p. A set of patterns defines a union of their languages. A set Q of patterns is said to be more general than a set P of patterns if for any pattern p in P there exists a pattern q in Q that is more general than p. Clearly, a more general set of patterns defines a larger union. A k-minimal multiple generalization (k-mmg) of a set S of strings is a minimally general set of at most k patterns that defines a union containing S. The syntactic notion of minimality enables us to efficiently compute a candidate for a semantically minimal concept. We present a general methodology for designing an efficient algorithm to find a k-mmg. Under some conditions an mmg can be used as an appropriate hypothesis for inductive inference from positive data. As a result, several classes of unions of pattern languages are shown to be polynomial time inferable from positive data.
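The definitions in the abstract can be made concrete with a small sketch. The code below is not the authors' algorithm; it is a brute-force backtracking check that illustrates membership in a pattern language, coverage of a sample by a union of patterns, and the "more general than" relation. The encoding (uppercase letters are variables, every other character is a constant) and the names `matches`, `covers`, and `more_general` are assumptions made only for this illustration, and the search may take exponential time in the worst case.

```python
from typing import Dict, Iterable, Optional


def is_variable(symbol: str) -> bool:
    # Assumed convention for this sketch: uppercase letters are variables,
    # every other character is a constant symbol.
    return symbol.isupper()


def matches(pattern: str, text: str,
            binding: Optional[Dict[str, str]] = None) -> bool:
    """True if `text` is obtained from `pattern` by replacing each variable,
    consistently, with a nonempty string."""
    binding = {} if binding is None else binding
    if not pattern:
        return not text
    head, rest = pattern[0], pattern[1:]
    if not is_variable(head):
        # A constant must appear literally at the front of the text.
        return text.startswith(head) and matches(rest, text[1:], binding)
    if head in binding:
        # A repeated variable must take the same value everywhere.
        value = binding[head]
        return text.startswith(value) and matches(rest, text[len(value):], binding)
    # Try every nonempty prefix, leaving at least one character
    # for each remaining pattern symbol.
    for end in range(1, len(text) - len(rest) + 1):
        trial = dict(binding)
        trial[head] = text[:end]
        if matches(rest, text[end:], trial):
            return True
    return False


def covers(patterns: Iterable[str], sample: Iterable[str]) -> bool:
    """True if the union of the pattern languages contains every sample string."""
    patterns = list(patterns)
    return all(any(matches(p, s) for p in patterns) for s in sample)


def more_general(q: str, p: str) -> bool:
    """True if pattern p is obtained from q by substituting nonempty patterns
    for the variables of q (the symbols of p are treated as plain characters)."""
    return matches(q, p)
```

For example, under this encoding `covers(["aX", "Xb"], ["ab", "aa", "bb"])` holds, and `more_general("aXb", "aYYb")` holds because `X` can be replaced by the pattern `YY`.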


Similar resources

A Polynomial Time Algorithm for Finding Finite Unions of Tree Pattern Languages

A tree pattern is a structured pattern known as a term in formal logic, and a tree pattern language is the set of trees which are the ground instances of a tree pattern. In this paper, we deal with the class of tree languages whose language is defined as a union of at most k tree pattern languages, where k is an arbitrary fixed positive number. In particular, we present a polynomial time algorithm...


Characteristic Sets for Inferring the Unions of the Tree Pattern Languages by the Most Fitting Hypotheses

A tree pattern p is a first-order term in formal logic, and the language of p is the set of all the tree patterns obtainable by replacing each variable in p with a tree pattern containing no variables. We consider the inductive inference of the unions of these languages from positive examples using strategies that guarantee some forms of minimality during the learning process. By a result in ou...


Context-Free Language Induction by Evolution of Deterministic Push-Down Automata Using Genetic Programming

The process of learning often consists of Inductive Inference, making generalizations from samples. The problem here is finding generalizations (Grammars) for Formal Languages from finite sets of positive and negative sample sentences. The focus of this paper is on Context-Free Languages (CFL’s) as defined by Context-Free Grammars (CFG’s), some of which are accepted by Deterministic Push-Down A...


Developments from enquiries into the learnability of the pattern languages from positive data

The pattern languages are languages that are generated from patterns, and were first proposed by Angluin as a nontrivial class that is inferable from positive data [D. Angluin, Finding patterns common to a set of strings, Journal of Computer and System Sciences 21 (1980) 46–62; D. Angluin, Inductive inference of formal languages from positive data, Information and Control 45 (1980) 117–135]. In...


Finite Automata and Unions of Regular Patterns with Bounded Constant Segments

The class of unbounded unions of regular pattern languages with bounded constant segments is identifiable from positive data in the limit [1]. Otherwise, no efficient algorithm that performs the inference of this class of languages is known. We propose a solution to this problem using the existing connexion between the positive variety of languages of dot depth 1/2, LJ [2] and the class of unbo...



Publication date: 1994